2024-12-20

Introduction

The Shiny app I have produced compares fitting a linear model with fitting a quadratic model when the underlying data are quadratic.

The synthetic data were generated from the formula \(y = -0.5x^2 + 10x - 18\) over the range \(x \in [3, 18]\). I also added Gaussian noise to the y-term, of the form \(\varepsilon \sim \mathcal{N}(\mu = 0, \sigma^2 = 1)\), so that each observation is \(y + \varepsilon\).

The code used to generate the data is shown below:

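# Grid of x values over [3, 18] in steps of 0.1
x_values <- seq(3, 18, by = 0.1)
# Noise-free quadratic response
y_values <- -0.5 * x_values^2 + 10 * x_values - 18
# Fix the seed so the noise is reproducible
set.seed(123)
# Standard Gaussian noise, one draw per observation
noise <- rnorm(length(y_values), mean = 0, sd = 1)
y_noisy <- y_values + noise
data <- data.frame(x = x_values, y = y_noisy)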

Data Visualised

Here is a visualisation of the data generated:
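
A minimal sketch of how such a plot could be drawn is given here. It assumes base R graphics (`plot()` and `curve()`); the app itself may use a different plotting library, and the colours and title are illustrative choices only.

```r
# Regenerate the synthetic data with the same formula and seed as above
x_values <- seq(3, 18, by = 0.1)
y_values <- -0.5 * x_values^2 + 10 * x_values - 18
set.seed(123)
y_noisy <- y_values + rnorm(length(y_values), mean = 0, sd = 1)
data <- data.frame(x = x_values, y = y_noisy)

# Scatter plot of the noisy observations
plot(data$x, data$y, pch = 16, col = "grey40",
     xlab = "x", ylab = "y", main = "Synthetic quadratic data")

# Overlay the noise-free generating curve for reference
curve(-0.5 * x^2 + 10 * x - 18, from = 3, to = 18,
      add = TRUE, col = "red", lwd = 2)
```

The red curve is the generating formula without noise, so the vertical scatter of the points around it reflects the added Gaussian noise.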

Fitting the models

In the Shiny app we then fit the two models to the data and visualise the fits, along with each model's prediction for a value of x chosen by the user.

# Include x^2 term in dataframe for use in quadratic model
data$x2 <- (data$x)^2
fit_linear <- lm(y~x, data = data)
fit_quad <- lm(y~x+x2, data = data)
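
To illustrate the prediction step, here is a hedged sketch of how each model's prediction at a single x could be computed with `predict()`. The name `x_chosen` and its value of 10 are hypothetical stand-ins for whatever input the app actually exposes.

```r
# Rebuild the data and both fits as above
x_values <- seq(3, 18, by = 0.1)
set.seed(123)
y_noisy <- -0.5 * x_values^2 + 10 * x_values - 18 +
  rnorm(length(x_values), mean = 0, sd = 1)
data <- data.frame(x = x_values, y = y_noisy)
data$x2 <- data$x^2
fit_linear <- lm(y ~ x, data = data)
fit_quad <- lm(y ~ x + x2, data = data)

# Hypothetical user-selected value (in the app this would come from an input)
x_chosen <- 10

# newdata must carry every predictor the models use, including x2
new_point <- data.frame(x = x_chosen, x2 = x_chosen^2)
pred_linear <- predict(fit_linear, newdata = new_point)
pred_quad <- predict(fit_quad, newdata = new_point)

# Noise-free value of the generating formula, for comparison
true_value <- -0.5 * x_chosen^2 + 10 * x_chosen - 18

c(linear = unname(pred_linear),
  quadratic = unname(pred_quad),
  truth = true_value)
```

Because `x2` is stored as a separate column, it has to be supplied explicitly in `newdata`. Writing the model as `lm(y ~ x + I(x^2), data = data)` would avoid this, at the cost of departing from the approach used above.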

Visualising the difference between the two models
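
One way the two fits could be compared, sketched here with base R graphics, is to overlay both fitted curves on the data and compute a simple error summary. The `rmse()` helper is a hypothetical name introduced for illustration and is not part of the app.

```r
# Rebuild the data and both fits as above
x_values <- seq(3, 18, by = 0.1)
set.seed(123)
y_noisy <- -0.5 * x_values^2 + 10 * x_values - 18 +
  rnorm(length(x_values), mean = 0, sd = 1)
data <- data.frame(x = x_values, y = y_noisy)
data$x2 <- data$x^2
fit_linear <- lm(y ~ x, data = data)
fit_quad <- lm(y ~ x + x2, data = data)

# Data with both fitted curves overlaid (x is already sorted)
plot(data$x, data$y, pch = 16, col = "grey60", xlab = "x", ylab = "y")
lines(data$x, fitted(fit_linear), col = "blue", lwd = 2)
lines(data$x, fitted(fit_quad), col = "red", lwd = 2)
legend("bottom", legend = c("Linear fit", "Quadratic fit"),
       col = c("blue", "red"), lwd = 2)

# Hypothetical helper: root mean squared error of a model's residuals
rmse <- function(fit) sqrt(mean(residuals(fit)^2))
rmse_values <- c(linear = rmse(fit_linear), quadratic = rmse(fit_quad))
rmse_values
```

Since the data are generated from a quadratic, the quadratic fit should give the smaller in-sample error, with its residuals close in size to the noise standard deviation of 1.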

Residual and Diagnostic Plots of Quadratic fit

The residual plots all appear as they should for each of the diagnostic measures, indicating that the quadratic model is a good fit!
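
For reference, a minimal sketch of how these diagnostic plots could be produced, assuming the standard `plot()` method for `lm` objects, which by default draws Residuals vs Fitted, Normal Q-Q, Scale-Location and Residuals vs Leverage:

```r
# Rebuild the data and the quadratic fit as above
x_values <- seq(3, 18, by = 0.1)
set.seed(123)
y_noisy <- -0.5 * x_values^2 + 10 * x_values - 18 +
  rnorm(length(x_values), mean = 0, sd = 1)
data <- data.frame(x = x_values, y = y_noisy)
data$x2 <- data$x^2
fit_quad <- lm(y ~ x + x2, data = data)

# Arrange the four default lm diagnostic plots in a 2 x 2 grid
old_par <- par(mfrow = c(2, 2))
plot(fit_quad)
par(old_par)  # restore the previous graphics settings

# Residual standard error; should be close to the noise sd of 1
resid_se <- sigma(fit_quad)
resid_se
```

Running the same `plot()` call on the linear fit would be a useful contrast, since a misspecified model typically shows a curved pattern in the Residuals vs Fitted panel.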